Data files

students.txt -> list of students: student identifier, year and university of graduation, advisor identifier, indicator for having a Chinese first name and last name, indicator for receiving an NSF Graduate Research Fellowship
bibliometric.dta -> publication counts for each student, including citation counts and counts adjusted for impact factor, reported separately for first-authored publications and for all publications
ready.dta -> data ready for analysis, obtained by running data_preparation.do


Do files

data_preparation.do -> assemble ready.dta from students.txt and bibliometric.dta, and label the variables
paper_regressions.do -> generate the numbers reported in Tables 2 and 3 of the paper
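
The listing above implies a run order: data_preparation.do first, because it builds ready.dta, then paper_regressions.do. The Python sketch below is not part of the package. It is a small, hypothetical helper that checks that the two raw input files are present and prints that order. The helper names (missing_inputs, run_order) are invented for illustration. It also assumes that paper_regressions.do reads ready.dta, which the listing suggests but does not state.

```python
from pathlib import Path

# File names come from the listing above; the run order is inferred from it.
RAW_INPUTS = ["students.txt", "bibliometric.dta"]

# (do file, data file it produces). Assumption: paper_regressions.do reads
# ready.dta and writes no data file of its own.
PIPELINE = [
    ("data_preparation.do", "ready.dta"),
    ("paper_regressions.do", None),
]


def missing_inputs(folder="."):
    """Return the raw input files that are not present in `folder`."""
    root = Path(folder)
    return [name for name in RAW_INPUTS if not (root / name).is_file()]


def run_order():
    """Return the do files in the order they should be run."""
    return [script for script, _ in PIPELINE]


if __name__ == "__main__":
    missing = missing_inputs()
    if missing:
        print("Missing input files:", ", ".join(missing))
    else:
        print("Run in this order:", " then ".join(run_order()))
```

Running the script from the folder that holds the data files either lists the missing inputs or prints the two do files in order. The do files themselves still have to be run in Stata.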